Abstract

We study particular patterns in planar rooted binary trees, namely those subtrees having the caterpillar property. The size of the biggest caterpillar subtree then becomes a new parameter with respect to which we find several enumerations.

1 Introduction 

In this work we want to study particular patterns in planar rooted 
binary trees. More precisely we will consider what seems to be a new 
statistic on this well known class of trees. We are interested in the size 
of the biggest subtree having the caterpillar property. 

Caterpillars have already been considered in the case of coalescent trees; see for example the interesting work of Rosenberg [4]. In particular, in a population genetics framework, when trees are used to represent ancestry relations among individuals, the presence of a caterpillar subtree often corresponds to interesting phenomena such as natural selection.

The problem of considering subtree structures is not new; see for example [1] and [5]. To our knowledge, caterpillars have not yet been considered in the context of planar rooted binary trees. Their study here can also be considered as an introductory step towards further work on the realization of caterpillars in non-planar rooted binary trees. Indeed, we believe it is possible to extend the main approach of this paper to the more difficult non-planar case.

After giving some basic definitions, we will provide the enumeration of planar rooted binary trees of a given size whose biggest caterpillar subtree has size at most (resp. at least, equal to) a fixed integer $k$. Furthermore, we will provide the expected value of the size of the biggest caterpillar subtree when trees of size $n$ are uniformly distributed.

Finally, in Section 5 we will see how caterpillar subtrees correspond to patterns extracted from 132-avoiding permutations. The resulting characterization seems quite interesting and deserves further study.



2 Definitions 

Planar rooted binary trees are enumerated with respect to the size, i.e. the number of leaves, by the well-known sequence of Catalan numbers, corresponding to entry A000108 in [6]. The respective generating function $C(x)$ is the following:

$$C(x) = \frac{1 - \sqrt{1 - 4x}}{2}.$$

The class of planar rooted binary trees will be denoted by $\mathcal{T}$, while $\mathcal{T}_n$ will represent the subset of $\mathcal{T}$ made of those elements having size $n$. In what follows we will use the term tree to refer to planar rooted binary trees.

We define a tree in $\mathcal{T}_n$ to be a caterpillar of size $n$ if each node is a leaf or has at least one leaf as a direct descendant. See for example Fig. 1 (a), (b).

Caterpillars can also be characterized by the fact that they are the most unbalanced trees. As a measure of tree imbalance we take the following index. Given a tree $t$ and a node $i$, let $t_l(i)$ (resp. $t_r(i)$) be the left (resp. right) subtree of $t$ determined by $i$. We define

$$\Delta_t(i) = |\mathrm{size}(t_l(i)) - \mathrm{size}(t_r(i))|.$$

If $t \in \mathcal{T}_n$, its Colless's index (see [3]) is defined as

$$\frac{2}{(n-2)(n-1)} \sum_{i \text{ node of } t} \Delta_t(i).$$

Colless's index is considered as a measure of tree imbalance (see [3]). Its value ranges between 0 and 1, where 0 corresponds to a completely balanced tree and 1 to a completely unbalanced one.

From the previous definitions it turns out that a tree of size $n > 1$ is a caterpillar if and only if its Colless's index is 1.

If $t \in \mathcal{T}_n$ we define $\gamma(t)$ as the size of the biggest caterpillar which can be seen as a subtree of $t$. We observe that, if $n > 1$, then $\gamma(t)$ is at least equal to two. In Fig. 2 we have depicted a tree having $\gamma = 5$.
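
The definitions of this section can be checked by brute force on small trees. The following is a minimal sketch, not taken from the paper: it assumes a hypothetical encoding in which a leaf is `None` and an internal node is a pair `(left, right)`, and the names `trees`, `colless` and `gamma` are illustrative. Leaves are assumed to contribute zero to the imbalance sum, and a subtree is understood as a node together with all of its descendants.

```python
from fractions import Fraction
from functools import lru_cache

LEAF = None  # assumed encoding: a leaf is None, an internal node is (left, right)

@lru_cache(maxsize=None)
def trees(n):
    """All planar rooted binary trees with n leaves."""
    if n == 1:
        return (LEAF,)
    return tuple((l, r) for i in range(1, n)
                 for l in trees(i) for r in trees(n - i))

def size(t):
    """Number of leaves of t."""
    return 1 if t is LEAF else size(t[0]) + size(t[1])

def is_caterpillar(t):
    """True if every internal node has at least one leaf child."""
    if t is LEAF:
        return True
    left, right = t
    if left is not LEAF and right is not LEAF:
        return False
    return is_caterpillar(left) and is_caterpillar(right)

def imbalance_sum(t):
    """Sum of |size(left) - size(right)| over the internal nodes of t."""
    if t is LEAF:
        return 0
    left, right = t
    return abs(size(left) - size(right)) + imbalance_sum(left) + imbalance_sum(right)

def colless(t):
    """Normalized Colless index of a tree with n >= 3 leaves."""
    n = size(t)
    return Fraction(2 * imbalance_sum(t), (n - 1) * (n - 2))

def gamma(t):
    """Size of the biggest caterpillar among the subtrees rooted at nodes of t."""
    if is_caterpillar(t):
        return size(t)
    return max(gamma(t[0]), gamma(t[1]))
```

For instance, `sum(gamma(t) == 5 for t in trees(7))` counts the trees of size 7 whose biggest caterpillar subtree has size 5, and `colless(t) == 1` can be compared with `is_caterpillar(t)` on all trees of a given small size.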

3 A recursive construction for the size of the biggest caterpillar subtree

Let $F_k^{\le}(x)$ be the ordinary generating function which gives the number of trees having the $\gamma$ parameter at most equal to $k \ge 2$. It is easy to see that $F = F_k^{\le}(x)$ satisfies the equation

$$F = x + F^2 - 2^{k-1} x^{k+1}. \qquad (1)$$

Indeed, a tree $t$ with $\gamma(t) \le k$ either has size one or is made of two trees $t_1$ and $t_2$ attached to the root such that $\gamma(t_1) \le k$ and $\gamma(t_2) \le k$. We must exclude the case in which one of $t_1$ and $t_2$ has size 1 and the other one is a caterpillar of size $k$, since the resulting tree would be a caterpillar of size $k+1$. Since there are exactly $2^{k-2}$ caterpillars of size $k$, each of which can be placed on either side of the leaf, the previous formula follows.

From (1) we obtain

$$F_k^{\le}(x) = \frac{1 - \sqrt{1 - 4x + 2^{k+1} x^{k+1}}}{2}.$$

Then, considering $F_k^{\ge}(x) = C(x) - F_{k-1}^{\le}(x)$, one has the number of trees having $\gamma \ge k$, while taking $F_k^{=}(x) = F_k^{\le}(x) - F_{k-1}^{\le}(x)$ one can compute the number of trees of a given size having $\gamma = k$. The following table shows the first coefficients of the Taylor expansion of $F_k^{\le}$, $F_k^{\ge}$ and $F_k^{=}$ when $k = 5$.

[Table: first Taylor coefficients of $F_5^{\le}$, $F_5^{\ge}$ and $F_5^{=}$; surviving entries: 16, 64, 240, 832.]


Note that the sixth coefficient of $F_k^{=}$ is 0. Indeed, as the reader can easily check, there is no tree of size $k+1$ having the $\gamma$ parameter equal to $k$.
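
The coefficients discussed above can be reproduced directly from recursion (1). The sketch below is an illustration rather than the author's code: the function names are hypothetical, and it assumes $k \ge 1$ for the series counting trees with $\gamma$ at most $k$ (for $k = 1$ the series reduces to $x$).

```python
from math import comb

def coeffs_le(k, N):
    """[x^0..x^N] of the series F solving F = x + F^2 - 2^(k-1) x^(k+1), k >= 1."""
    f = [0] * (N + 1)
    for n in range(1, N + 1):
        f[n] = sum(f[i] * f[n - i] for i in range(1, n))  # contribution of F^2
        if n == 1:
            f[n] += 1                                     # the term x
        if n == k + 1:
            f[n] -= 2 ** (k - 1)                          # excluded caterpillars of size k+1
    return f

def coeffs_catalan(N):
    """[x^0..x^N] of C(x): trees counted by number of leaves."""
    return [0] + [comb(2 * n - 2, n - 1) // n for n in range(1, N + 1)]

def coeffs_ge(k, N):
    """Trees with gamma >= k: C(x) minus the series for gamma <= k-1 (k >= 2)."""
    return [c - f for c, f in zip(coeffs_catalan(N), coeffs_le(k - 1, N))]

def coeffs_eq(k, N):
    """Trees with gamma = k: difference of two consecutive series (k >= 2)."""
    return [a - b for a, b in zip(coeffs_le(k, N), coeffs_le(k - 1, N))]
```

Calling `coeffs_eq(5, 10)` gives the counts of trees of size 0 to 10 with $\gamma = 5$; the entry at index 6 is the vanishing coefficient mentioned above.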

We conclude this section by observing that none of the sequences corresponding to $F_k^{\le}$, $F_k^{\ge}$ and $F_k^{=}$ seems to be present in [6].



3.1 Asymptotic growth of trees with no pitchforks 

The function $F_k^{\le}(x)$ is analytic except when $x$ is a solution of the equation $1 - 4x + 2^{k+1}x^{k+1} = 0$. By Pringsheim's theorem (see [2]) we can assume, for our purposes, that the dominant singularity of $F_k^{\le}(x)$ corresponds to the positive real solution of $1 - 4x + 2^{k+1}x^{k+1} = 0$ which is closest to the origin. Let $\rho_k$ be this solution. We observe that, when $k$ increases, $\rho_k$ approaches 1/4. In order to prove this claim we remark that, for $k \ge 2$, we have

$$\frac{1}{4} < \rho_k < \frac{2}{5}. \qquad (2)$$

Indeed this can be shown by considering the polynomial

$$y(x) = 1 - 4x + 2^{k+1}x^{k+1},$$

which satisfies $y(1/4) > 0$ and $y(2/5) < 0$. Furthermore $y$ is decreasing between 0 and 1/4, as can be seen by solving the equation $y'(x) = 0$, which gives $x = \sqrt[k]{4/(2^{k+1}(k+1))} > 1/4$. We now proceed by bootstrapping (see [2]). Writing the defining equation for $\rho_k$ as

$$x = \frac{1}{4}\left(1 + 2^{k+1}x^{k+1}\right)$$



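and making use of (2) yields next

$$\frac{1}{4}\left(1 + \frac{1}{2^{k+1}}\right) < \rho_k < \frac{1}{4}\left(1 + \left(\frac{4}{5}\right)^{k+1}\right),$$

which is sufficient to prove that $\rho_k \to 1/4$.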

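A further iteration of the previous inequality shows that

$$\rho_k < \frac{1}{4}\left(1 + 2^{k+1}\left(\frac{1}{4}\left(1 + \left(\frac{4}{5}\right)^{k+1}\right)\right)^{k+1}\right),$$

which, considering that $(4/5)^{k+1} \sim 0$, gives

$$\rho_k \lesssim \frac{1}{4}\left(1 + \frac{1}{2^{k+1}}\right).$$

Thus

$$\rho_k \sim \frac{1}{4}\left(1 + \frac{1}{2^{k+1}}\right),$$

which means

$$\rho_k \approx \frac{1}{4} + \frac{1}{2^{k+3}}.$$
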
In the following table we show the first approximated values of $\rho_k$.

k    $\rho_k$
2    0.3090169
3    0.2718445
4    0.2593950
5    0.2543301
6    0.2520691
7    0.2510085
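
The values of $\rho_k$ can be recomputed numerically. The following sketch is only an illustration: it uses plain bisection on the interval $[1/4, 2/5]$, where the polynomial $1 - 4x + 2^{k+1}x^{k+1}$ changes sign exactly once for $k \ge 2$; the function name `rho` and the tolerance are assumptions of this example.

```python
def rho(k, tol=1e-12):
    """Smallest positive root of 1 - 4x + 2^(k+1) x^(k+1), for k >= 2, by bisection."""
    def y(x):
        return 1 - 4 * x + (2 * x) ** (k + 1)

    lo, hi = 0.25, 0.4  # y(lo) > 0 and y(hi) < 0 for every k >= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if y(mid) > 0:
            lo = mid    # the root lies to the right of mid
        else:
            hi = mid    # the root lies to the left of mid (or at mid)
    return (lo + hi) / 2
```

Comparing `rho(k)` with `0.25 + 2 ** -(k + 3)` for increasing `k` gives a quick numerical view of how fast $\rho_k$ approaches 1/4.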

For a given constant $a$ we can always write

$$1 - 4x + 2^{k+1}x^{k+1} = (a - x)\left(4 - 2^{k+1}\sum_{i=0}^{k} a^i x^{k-i}\right) + 1 - 4a + 2^{k+1}a^{k+1};$$

then, substituting the solution $\rho_k$ for $a$, we have

$$1 - 4x + 2^{k+1}x^{k+1} = (\rho_k - x)\left(4 - 2^{k+1}\sum_{i=0}^{k} \rho_k^i x^{k-i}\right).$$

Defining

$$B(x) = 4 - 2^{k+1}\sum_{i=0}^{k} \rho_k^i x^{k-i}$$

and by standard asymptotic calculations (see [5]) we have

$$[x^n]F_k^{\le}(x) \sim \frac{1}{4}\sqrt{\frac{4\rho_k - (k+1)2^{k+1}\rho_k^{k+1}}{\pi n^3}}\left(\frac{1}{\rho_k}\right)^n \qquad (3)$$

where $n \to \infty$.

We can apply the result in (3) to provide the asymptotic behaviour
of trees with no caterpillar of size 3. Caterpillars with three leaves are 
also called pitchforks in 

Proposition 1 The number of pitchfork-free trees of size $n$ is given by $[x^n]F_2^{\le}(x)$ and it satisfies asymptotically the following relation:

$$[x^n]F_2^{\le}(x) \sim \frac{1}{4}\sqrt{\frac{4R - 24R^3}{\pi n^3}}\left(\frac{1}{R}\right)^n,$$

where $R = \frac{1}{4}(\sqrt{5} - 1) = 0.3090169$.

When $n = 100$ the ratio between $[x^{100}]F_2^{\le}(x)$ and its approximation is 0.9933.
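
The quality of this approximation can be examined numerically. The sketch below is an illustration under stated assumptions: the exact counts come from the recursion $F = x + F^2 - 2x^3$ (equation (1) with $k = 2$), the approximation is $\frac{1}{4}\sqrt{(4R - 24R^3)/(\pi n^3)}\,R^{-n}$, and the function names are hypothetical.

```python
from math import pi, sqrt

def pitchfork_free(N):
    """Exact numbers of pitchfork-free trees with 0..N leaves, from F = x + F^2 - 2x^3."""
    f = [0] * (N + 1)
    for n in range(1, N + 1):
        f[n] = sum(f[i] * f[n - i] for i in range(1, n))
        if n == 1:
            f[n] += 1   # the single leaf
        if n == 3:
            f[n] -= 2   # the two caterpillars of size 3 (pitchforks)
    return f

def approximation(n):
    """Asymptotic estimate for the number of pitchfork-free trees of size n."""
    R = (sqrt(5) - 1) / 4
    return 0.25 * sqrt((4 * R - 24 * R ** 3) / (pi * n ** 3)) * R ** (-n)

def ratio(n):
    """Exact count divided by its asymptotic estimate."""
    return pitchfork_free(n)[n] / approximation(n)
```

Evaluating `ratio(100)` allows a direct comparison with the value quoted above, and larger `n` shows the ratio tending to 1.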

4 The average size of the biggest caterpillar subtree

In this section we want to determine $E_n(\gamma)$, which denotes the average value of the parameter $\gamma(t)$ when $t \in \mathcal{T}_n$.

As shown in Section 3, when $k > 0$, $F_k^{\le}(x)$ gives the number of trees having $\gamma$ at most $k$. Indeed, also in the case $k = 1$, we have $F_1^{\le}(x) = (1 - \sqrt{1 - 4x + 4x^2})/2 = x$, where $x$ represents the unique caterpillar of size 1.

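Furthermore, let $f_k^{(n)} = [x^n]F_k^{\le}(x)$ and, analogously, denote by $C^{(n)} = [x^n]C(x)$ the $n$-th Catalan number. Then we can express the desired average value as follows:

$$
\begin{aligned}
E_n(\gamma) &= \frac{1 \cdot f_1^{(n)} + \sum_{k \ge 1} (k+1)\left(f_{k+1}^{(n)} - f_k^{(n)}\right)}{C^{(n)}} \\
&= \frac{-f_1^{(n)} - f_2^{(n)} - \dots - f_{n-1}^{(n)} + n f_n^{(n)} + \sum_{k \ge n} (k+1)\left(f_{k+1}^{(n)} - f_k^{(n)}\right)}{C^{(n)}} \\
&= \frac{-f_1^{(n)} - \dots - f_{n-1}^{(n)} + n\,C^{(n)}}{C^{(n)}} \\
&= 1 + \frac{1}{C^{(n)}} \sum_{k \ge 1} \left(C^{(n)} - f_k^{(n)}\right).
\end{aligned}
$$

In the previous calculation we have used the fact that for $k \ge n$ we always have $f_k^{(n)} = C^{(n)}$.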
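
The average can also be computed exactly for small $n$. The sketch below is an illustration, not the author's code: it relies on the tail-sum identity $E_n(\gamma) = \sum_{k \ge 0} P(\gamma > k)$, which gives $E_n(\gamma) = 1 + \frac{1}{C^{(n)}}\sum_{k=1}^{n-1}(C^{(n)} - f_k^{(n)})$, and on recursion (1) for the numbers $f_k^{(n)}$ of trees of size $n$ with $\gamma \le k$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def count_le(k, n):
    """Number of trees with n leaves whose gamma parameter is at most k (k >= 1)."""
    f = [0] * (n + 1)
    for m in range(1, n + 1):
        f[m] = sum(f[i] * f[m - i] for i in range(1, m))
        if m == 1:
            f[m] += 1
        if m == k + 1:
            f[m] -= 2 ** (k - 1)  # remove the caterpillars of size k+1
    return f[n]

def expected_gamma(n):
    """Exact average of gamma over all trees with n leaves, as a Fraction."""
    c = comb(2 * n - 2, n - 1) // n                       # number of trees with n leaves
    tail = sum(c - count_le(k, n) for k in range(1, n))   # trees with gamma > k, k = 1..n-1
    return 1 + Fraction(tail, c)
```

For example, `float(expected_gamma(10))` can be compared with the exact values reported later in this section.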

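It is now sufficient to find the $n$-th term of the function

$$U(x) = \sum_{k \ge 1}\left(C(x) - F_k^{\le}(x)\right) = \sum_{k \ge 1} \frac{\sqrt{1 - 4x + 2^{k+1}x^{k+1}} - \sqrt{1 - 4x}}{2}.$$

In what follows we want to find a function $\tilde{U}$ which estimates $U$ near the dominant singularity 1/4. According to [5], the $n$-th term of the Taylor expansion of $\tilde{U}$ will provide an approximation of $[x^n]U(x)$.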

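Let us fix $x$ near 1/4 and let us consider the threshold function

$$k_0 = k_0(x) = \log_2\left(\frac{1}{1 - 4x}\right).$$

Then, supposing $k > k_0$, we have that

$$\sqrt{1 + \frac{2^{k+1}x^{k+1}}{1 - 4x}} \sim \sqrt{1 + \frac{1}{2^{k+1}(1 - 4x)}} \sim 1 + \frac{1}{2^{k+2}(1 - 4x)},$$

while if we suppose $k < k_0$ we will use the approximation

$$\sqrt{1 + \frac{2^{k+1}x^{k+1}}{1 - 4x}} \sim \sqrt{1 + \frac{1}{2^{k+1}(1 - 4x)}} \sim \sqrt{\frac{1}{2^{k+1}(1 - 4x)}}.$$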
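
For the fixed $x$ near 1/4 we estimate $U(x)$ as follows:

$$
\begin{aligned}
U(x) &= \frac{\sqrt{1 - 4x}}{2} \sum_{k \ge 1} \left(\sqrt{1 + \frac{2^{k+1}x^{k+1}}{1 - 4x}} - 1\right) \\
&\sim \frac{\sqrt{1 - 4x}}{2} \sum_{k < k_0} \left(\sqrt{\frac{1}{2^{k+1}(1 - 4x)}} - 1\right) + \frac{\sqrt{1 - 4x}}{2} \sum_{k \ge k_0} \frac{1}{2^{k+2}(1 - 4x)} \\
&= \frac{1}{2\sqrt{2}} \sum_{i=1}^{k_0 - 1} \left(\frac{1}{\sqrt{2}}\right)^{i} - \frac{\sqrt{1 - 4x}}{2}\left(\log_2\left(\frac{1}{1 - 4x}\right) - 1\right) + \frac{1}{8\sqrt{1 - 4x}} \sum_{i \ge 0} 2^{-i - k_0} \\
&= \frac{1 - \sqrt{2}\sqrt{1 - 4x}}{4 - 2\sqrt{2}} + \sqrt{1 - 4x}\left(\log_2\left(\sqrt{1 - 4x}\right) + \frac{1}{2}\right) + \frac{\sqrt{1 - 4x}}{4} \\
&= \frac{1}{4 - 2\sqrt{2}} + \sqrt{1 - 4x}\left(\frac{1}{2 - 2\sqrt{2}} + \log_2\left(\sqrt{1 - 4x}\right) + \frac{3}{4}\right).
\end{aligned}
$$
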

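Using the previous calculation we have the following result.

Proposition 2 Let us denote

$$\tilde{U}(x) = \frac{1}{4 - 2\sqrt{2}} + \sqrt{1 - 4x}\left(\frac{1}{2 - 2\sqrt{2}} + \log_2\left(\sqrt{1 - 4x}\right) + \frac{3}{4}\right);$$

then

$$E_n(\gamma) \sim \frac{[x^n]\tilde{U}(x)}{C^{(n)}}.$$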



As a test one can consider the following table where, for several values of $n$, we compare the true $E_n(\gamma)$ with the approximation given by Proposition 2.

n                                10      20      50      100     200     500     1000
$E_n(\gamma)$                    4.535   5.120   6.202   7.107   8.052   9.334   10.318
$[x^n]\tilde{U}(x) / C^{(n)}$    4.032   5.109   6.47    7.490   8.498   9.824   10.825


We can go a step further in our approximation by considering the following statement.

Corollary 1 When $n \to \infty$ we have

$$E_n(\gamma) \sim \log_2(n).$$



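Proof. We use the result of Proposition 2 and the well-known asymptotic behaviour of Catalan numbers:

$$C^{(n)} \sim \frac{4^{n-1}}{\sqrt{\pi n^3}}.$$

Furthermore, by standard techniques (see again [2]), we also calculate the behaviour of

$$[x^n]\sqrt{1 - 4x}\,\log_2\left(\sqrt{1 - 4x}\right) \sim \frac{4^{n-1}\log_2(n)}{\sqrt{\pi n^3}}$$

and

$$[x^n]\sqrt{1 - 4x} \sim -\frac{2 \cdot 4^{n-1}}{\sqrt{\pi n^3}}.$$

Finally we have

$$E_n(\gamma) \sim \frac{[x^n]\tilde{U}(x)}{C^{(n)}} \sim \frac{4^{n-1}\log_2(n)}{\sqrt{\pi n^3}} \cdot \frac{\sqrt{\pi n^3}}{4^{n-1}} = \log_2(n). \qquad \Box$$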



For $n = 1000$ in the previous table we have $E_n(\gamma) = 10.318$ while $\log_2(n) = 9.96578$, which is quite close to the true value.



5 Caterpillars in permutations Av(132)

In Section 2 we introduced caterpillars as objects related to planar rooted binary trees. We know that the class of permutations avoiding the pattern 132 is also enumerated by Catalan numbers. Indeed, one can bijectively map the set $\mathcal{T}_{n+1}$ onto the set $Av_n(132)$, where the last symbol classically denotes the class of permutations of size $n$ which avoid 132. In particular, in what follows, we will use a bijection $\phi : \mathcal{T}_{n+1} \to Av_n(132)$ which works as described below.

Take $t \in \mathcal{T}_{n+1}$ and visit it according to the pre-order traversal, labelling each node of outdegree two in decreasing order starting with the label $n$ for the root. After this first step one has a tree labelled with integers at its nodes of outdegree two. Each leaf now collapses to its direct ancestor, which takes a new label by receiving on the left (resp. right) the label of its left (resp. right) child. We go on collapsing leaves until we obtain a tree made of one node, which is labelled with a permutation of size $n$. See Fig. 3 for an instance of this mapping.
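
The mapping just described can be sketched in code. This is a minimal illustration under assumptions that are not in the paper: a leaf is encoded as `None`, an internal node as a pair `(left, right)`, and the collapsing of leaves is read as an in-order listing of the pre-order labels. The names `trees`, `phi` and `contains_132` are hypothetical.

```python
from functools import lru_cache
from itertools import combinations

LEAF = None  # assumed encoding: a leaf is None, an internal node is (left, right)

@lru_cache(maxsize=None)
def trees(n):
    """All planar rooted binary trees with n leaves."""
    if n == 1:
        return (LEAF,)
    return tuple((l, r) for i in range(1, n)
                 for l in trees(i) for r in trees(n - i))

def internal_nodes(t):
    """Number of nodes of outdegree two."""
    return 0 if t is LEAF else 1 + internal_nodes(t[0]) + internal_nodes(t[1])

def phi(t):
    """Label internal nodes n, n-1, ..., 1 in pre-order, then list the labels in-order."""
    labels = iter(range(internal_nodes(t), 0, -1))

    def walk(node):
        if node is LEAF:
            return ()
        label = next(labels)       # pre-order: the node is labelled before its children
        left = walk(node[0])
        right = walk(node[1])
        return left + (label,) + right

    return walk(t)

def contains_132(p):
    """True if p has entries a, b, c (in this order of position) with a < c < b."""
    return any(a < c < b for a, b, c in combinations(p, 3))
```

With this encoding, the image of all trees with six leaves, `{phi(t) for t in trees(6)}`, can be inspected to confirm that the permutations are pairwise distinct and avoid 132.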

Through $\phi$ we can see how caterpillars can be interpreted inside permutations without the pattern 132. In order to do this we need the following definition.

Figure 3: The mapping $\phi$.



Let $\pi = \pi_1\pi_2 \cdots \pi_n$ be a permutation. For a given entry $\pi_i$ we define $\Gamma_\pi(\pi_i)$ as the set made of those entries $\pi_k$ such that:

1) $\pi_k \le \pi_i$;

2) all the entries in $\pi$ which are between $\pi_k$ and $\pi_i$ are less than or equal to $\pi_i$.

If $\pi = \pi_1\pi_2 \cdots \pi_n$ is a permutation, then we define $\tilde{\Gamma}_\pi(\pi_i)$ as the permutation one obtains by extracting from $\pi$ the elements belonging to $\Gamma_\pi(\pi_i)$, respecting their order and relabelling them with $1, 2, \ldots$ so that relative order is preserved. The set of permutations $(\tilde{\Gamma}_\pi(\pi_i))_{i=1\ldots n}$ will then be denoted by $\tilde{\Gamma}_\pi$.
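
The two conditions above say that the entries selected for a given $\pi_i$ form the maximal block of consecutive entries around $\pi_i$ that are all at most $\pi_i$. The sketch below illustrates this reading; the function names are hypothetical, permutations are plain tuples, and the extracted entries are relabelled to $1, 2, \ldots$ as an assumption consistent with the example that follows in the text.

```python
def gamma_set(p, i):
    """Entries of the maximal block around position i whose values are <= p[i]."""
    lo = hi = i
    while lo > 0 and p[lo - 1] <= p[i]:
        lo -= 1
    while hi + 1 < len(p) and p[hi + 1] <= p[i]:
        hi += 1
    return p[lo:hi + 1]

def standardize(seq):
    """Relabel distinct entries with 1..len(seq), preserving their relative order."""
    rank = {v: r for r, v in enumerate(sorted(seq), start=1)}
    return tuple(rank[v] for v in seq)

def gamma_tilde(p):
    """The list of extracted permutations, one for each position of p."""
    return [standardize(gamma_set(p, i)) for i in range(len(p))]
```

Applied to the tuple `(4, 5, 3, 1, 2, 6, 8, 7)`, `gamma_tilde` returns one extracted permutation per entry, in the order of the entries.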

As an example one can consider the permutation $\pi$ which is depicted in Fig. 4. In this case $\tilde{\Gamma}_\pi$ is made of

(1), (45312), (312), (1), (12), (453126), (45312687), (1).

Figure 4: The permutation $\pi = (45312687)$.

The next proposition describes how caterpillars are realized inside permutations avoiding the pattern 132. It is interesting to see that the presence of such particular subtrees is connected to the property of avoiding the pattern 231.

Proposition 3 If $t \in \mathcal{T}_{n+1}$ and $\phi(t) = \pi = \pi_1\pi_2 \ldots \pi_n$, then the following hold:

i) caterpillar subtrees of $t$ correspond through $\phi$ to those permutations in $\tilde{\Gamma}_\pi$ avoiding the pattern 231;

ii) $\gamma(t) - 1$ corresponds to the size of the biggest permutation in $Av(231) \cap \tilde{\Gamma}_\pi$.


Proof. Label $t$ according to the procedure $\phi$. If a node is labelled with $m$, consider the subtree $t_m$ whose root is $m$. The nodes belonging to $t_m$ form the subsequence of $\pi$ made of the elements of $\Gamma_\pi(m)$. So we find the pattern 231 in $\tilde{\Gamma}_\pi(m)$ if and only if we can find a node in $t_m$ having two direct descendants which are not leaves of $t$. It is now sufficient to observe that $t_m$ is a caterpillar if and only if it does not contain such a node. Summarizing, for every node $m$ of $t$, $t_m$ is a caterpillar subtree of size $k+1$ if and only if $\tilde{\Gamma}_\pi(m) \in Av_k(231)$. $\Box$
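
The second statement of the proposition can be tested exhaustively on small trees. The following sketch is an illustration only and makes several assumptions that are not in the paper: leaves are `None`, internal nodes are pairs, the mapping from trees to permutations is implemented as an in-order listing of pre-order labels, and the set attached to an entry is taken to be the maximal block of consecutive entries not exceeding it. All function names are hypothetical.

```python
from functools import lru_cache
from itertools import combinations

LEAF = None  # assumed encoding: a leaf is None, an internal node is (left, right)

@lru_cache(maxsize=None)
def trees(n):
    """All planar rooted binary trees with n leaves."""
    if n == 1:
        return (LEAF,)
    return tuple((l, r) for i in range(1, n)
                 for l in trees(i) for r in trees(n - i))

def size(t):
    return 1 if t is LEAF else size(t[0]) + size(t[1])

def is_caterpillar(t):
    if t is LEAF:
        return True
    if t[0] is not LEAF and t[1] is not LEAF:
        return False
    return is_caterpillar(t[0]) and is_caterpillar(t[1])

def gamma(t):
    """Size of the biggest caterpillar subtree of t."""
    return size(t) if is_caterpillar(t) else max(gamma(t[0]), gamma(t[1]))

def phi(t):
    """Pre-order labels n, n-1, ..., 1 on internal nodes, listed in-order."""
    labels = iter(range(size(t) - 1, 0, -1))

    def walk(node):
        if node is LEAF:
            return ()
        label = next(labels)
        return walk(node[0]) + (label,) + walk(node[1])

    return walk(t)

def block(p, i):
    """Maximal block of consecutive entries around position i that are <= p[i]."""
    lo = hi = i
    while lo > 0 and p[lo - 1] <= p[i]:
        lo -= 1
    while hi + 1 < len(p) and p[hi + 1] <= p[i]:
        hi += 1
    return p[lo:hi + 1]

def contains_231(p):
    """True if p has entries a, b, c (in this order of position) with c < a < b."""
    return any(c < a < b for a, b, c in combinations(p, 3))

def biggest_231_free_block(p):
    blocks = (block(p, i) for i in range(len(p)))
    return max(len(b) for b in blocks if not contains_231(b))

def proposition_holds(n):
    """Check gamma(t) - 1 against the biggest 231-free block, for all trees of size n."""
    return all(gamma(t) - 1 == biggest_231_free_block(phi(t)) for t in trees(n))
```

Pattern containment does not depend on relabelling, so the blocks are tested directly without standardizing them first.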

Using the results of Proposition 3, from the previous sections we can derive some properties of the permutations in $\tilde{\Gamma}_\pi$ when $\pi$ avoids the pattern 132. These are stated in the next two corollaries.

Corollary 2 The number of permutations $\pi \in Av(132)$ such that all elements in $\tilde{\Gamma}_\pi$ whose size is greater than one contain the pattern 231 is given by

$$\frac{F_2^{\le}(x)}{x} - 1 = \frac{1 - 2x - \sqrt{1 - 4x + 8x^3}}{2x}.$$

The first terms of the sequence are:

1, 0, 1, 2, 6, 16, 45, 126, 358, 1024, 2954, 8580, 25084, 73760, 218045.

Remark: given $\pi = \pi_1 \ldots \pi_n$ we say that $\pi_i$ is a valley when $\pi_{i-1}$ and $\pi_{i+1}$ (if they exist) are greater than $\pi_i$. Analogously $\pi_i$ is said to be a peak if both $\pi_{i-1}$ and $\pi_{i+1}$ exist and $\pi_{i-1} < \pi_i > \pi_{i+1}$. In this sense, the permutations $\pi$ considered in Corollary 2 can be characterized, among those in $Av(132)$, by the fact that each entry $\pi_i$ is either a valley or is such that $\Gamma_\pi(\pi_i)$ contains at least one peak. We also observe that sequence A025266 of [6] provides the same list of numbers as the previous corollary, as integers enumerating Motzkin paths with some constraints.

Finally we state the following result, which can be deduced from Corollary 1.






Corollary 3 If $\pi \in Av(132)$ has size $n$, the expected size of the biggest permutation in $Av(231) \cap \tilde{\Gamma}_\pi$ is asymptotic to $\log_2(n)$.

6 Further works 

In the present paper we have focused our attention on the presence of caterpillar subtrees in planar rooted binary trees. As a second step we would like to investigate the case of non-planar rooted binary trees. We think that the approach we have used here could be refined in order to solve the enumeration in the non-planar case.

Furthermore, we think that the realization of $\Gamma_\pi$ for a given permutation $\pi$ corresponds to an interesting combinatorial object which deserves further study.